
    Generative Knowledge Selection for Knowledge-Grounded Dialogues

    Knowledge selection is key in knowledge-grounded dialogue (KGD): it aims to select an appropriate knowledge snippet to be used in the utterance, based on the dialogue history. Previous studies mainly take a classification approach, labeling each candidate snippet as "relevant" or "irrelevant" independently. However, such approaches neglect the interactions between snippets, making it difficult to infer their meaning, and they lack modeling of the discourse structure of dialogue-knowledge interactions. We propose a simple yet effective generative approach for knowledge selection, called GenKS. GenKS learns to select snippets by generating their identifiers with a sequence-to-sequence model, and thus captures intra-knowledge interactions inherently through attention mechanisms. Meanwhile, we devise a hyperlink mechanism to model dialogue-knowledge interactions explicitly. We conduct experiments on three benchmark datasets and verify that GenKS achieves the best results on both knowledge selection and response generation.
    Comment: Findings of EACL-2
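
    As a rough sketch of the identifier-generation idea, the snippet below concatenates the dialogue with identifier-prefixed knowledge candidates and lets a HuggingFace seq2seq model emit a snippet identifier; the "[1]"-style identifiers, the flat input layout, and the use of t5-small are illustrative assumptions rather than GenKS's exact configuration.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    dialogue = "User: Who directed the movie Inception?"
    snippets = [
        "Inception was directed by Christopher Nolan.",
        "Inception was released in 2010.",
    ]

    # Concatenate the dialogue with identifier-prefixed snippets so the model can
    # attend over all candidates jointly and emit the chosen snippet's identifier.
    source = dialogue + " " + " ".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    inputs = tokenizer(source, return_tensors="pt", truncation=True)

    # A model fine-tuned on this format would generate an identifier such as "[1]".
    pred = model.generate(**inputs, max_new_tokens=4)
    print(tokenizer.decode(pred[0], skip_special_tokens=True))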

    Towards Empathetic Dialogue Generation over Multi-type Knowledge

    Enabling machines with empathetic abilities to provide context-consistent responses is crucial on both the semantic and the emotional level. The task of empathetic dialogue generation is proposed to address this problem. However, the lack of external knowledge makes it difficult to perceive implicit emotions from a limited dialogue history. To address these challenges, we propose to leverage multi-type knowledge, i.e., commonsense knowledge and an emotional lexicon, to explicitly understand and express emotions in empathetic dialogue generation. We first enrich the dialogue history by jointly interacting with the two types of knowledge and construct an emotional context graph. Then we introduce a multi-type knowledge-aware context encoder to learn emotional context representations and distill emotional signals, which are the prerequisites for predicting the emotions expressed in responses. Finally, we propose an emotional cross-attention mechanism to exploit the emotional dependencies between the emotional context graph and the target empathetic response. Extensive experiments on a benchmark dataset show that our proposed framework outperforms state-of-the-art baselines in terms of automatic metrics and human evaluations.
    Comment: arXiv admin note: text overlap with arXiv:1911.0869
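
    A minimal sketch of what a cross-attention layer between emotional context representations and response-side decoder states could look like, assuming standard PyTorch modules; the dimensions and the use of nn.MultiheadAttention are illustrative assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class EmotionalCrossAttention(nn.Module):
        def __init__(self, d_model: int = 512, n_heads: int = 8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

        def forward(self, decoder_states, emotion_context):
            # decoder_states: (batch, tgt_len, d_model), response-side queries.
            # emotion_context: (batch, n_nodes, d_model), encodings of the emotional context graph.
            attended, _ = self.attn(decoder_states, emotion_context, emotion_context)
            return attended

    decoder_states = torch.randn(2, 10, 512)
    emotion_context = torch.randn(2, 24, 512)
    print(EmotionalCrossAttention()(decoder_states, emotion_context).shape)  # torch.Size([2, 10, 512])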

    Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents

    Large Language Models (LLMs) have demonstrated a remarkable ability to generalize zero-shot to various language-related tasks. This paper explores generative LLMs such as ChatGPT and GPT-4 for relevance ranking in Information Retrieval (IR). Surprisingly, our experiments reveal that properly instructed ChatGPT and GPT-4 can deliver results competitive with, and even superior to, supervised methods on popular IR benchmarks. Notably, GPT-4 outperforms the fully fine-tuned monoT5-3B on MS MARCO by an average of 2.7 nDCG on TREC datasets, an average of 2.3 nDCG on eight BEIR datasets, and an average of 2.7 nDCG on the ten low-resource languages of Mr.TyDi. Subsequently, we delve into the potential for distilling the ranking capabilities of ChatGPT into a specialized model. Our small specialized model, trained on 10K ChatGPT-generated examples, outperforms monoT5 trained on 400K annotated MS MARCO examples on BEIR. The code to reproduce our results is available at www.github.com/sunnweiwei/RankGP
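
    As a concrete illustration of instructing an LLM to re-rank, the sketch below numbers candidate passages, asks the model to return an ordering such as "[2] > [1] > [3]", and parses that answer back into a permutation; the prompt wording, the answer format, and the call_llm helper are hypothetical stand-ins, not the paper's exact prompts or pipeline.

    import re

    def build_rerank_prompt(query, passages):
        # Number each passage so the model can refer to it by identifier.
        numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
        return (
            "Rank the following passages by their relevance to the query.\n"
            f"Query: {query}\n{numbered}\n"
            "Answer only with the passage identifiers in descending order of relevance, "
            "e.g. [2] > [1] > [3]."
        )

    def parse_ranking(response, n):
        # Turn an answer like "[2] > [3] > [1]" into a permutation of passage indices.
        order = []
        for match in re.findall(r"\[(\d+)\]", response):
            idx = int(match) - 1
            if 0 <= idx < n and idx not in order:
                order.append(idx)
        # Append any passages the model failed to mention, keeping their original order.
        return order + [i for i in range(n) if i not in order]

    # Usage, with call_llm as a stand-in for any chat-completion client:
    #   ranking = parse_ranking(call_llm(build_rerank_prompt(query, passages)), len(passages))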

    Towards Explainable Conversational Recommender Systems

    Explanations in conventional recommender systems have demonstrated benefits in helping users understand the rationale behind recommendations and in improving the system's efficiency, transparency, and trustworthiness. In a conversational environment, multiple contextualized explanations need to be generated, which poses further challenges. To better measure explainability in conversational recommender systems (CRS), we propose ten evaluation perspectives based on concepts from conventional recommender systems together with the characteristics of CRS. We assess five existing CRS benchmark datasets using these metrics and observe the need to improve the explanation quality of CRS. To achieve this, we apply manual and automatic approaches to extend these dialogues and construct a new CRS dataset, Explainable Recommendation Dialogues (E-ReDial). It includes 756 dialogues with over 2,000 high-quality rewritten explanations. We compare two baseline approaches to explanation generation based on E-ReDial. Experimental results suggest that models trained on E-ReDial can significantly improve explainability, and that introducing knowledge into the models can further improve performance. GPT-3 in the in-context learning setting generates more realistic and diverse movie descriptions, whereas T5 trained on E-ReDial better generates clear reasons for recommendations based on user preferences. E-ReDial is available at https://github.com/Superbooming/E-ReDial
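
    To illustrate the in-context learning baseline, the sketch below assembles a few-shot prompt that pairs dialogue contexts and recommended items with human-written explanations and leaves the new case open for the model to complete; the field names, demonstration content, and prompt layout are assumptions for illustration, not the prompts used with GPT-3 in the paper.

    def build_explanation_prompt(examples, history, recommendation):
        # Few-shot prompt: each demonstration pairs a dialogue context and a
        # recommended item with a human-written explanation; the new case is
        # appended last for the model to complete.
        parts = []
        for ex in examples:
            parts.append(
                f"Dialogue: {ex['history']}\nRecommendation: {ex['item']}\n"
                f"Explanation: {ex['explanation']}\n"
            )
        parts.append(f"Dialogue: {history}\nRecommendation: {recommendation}\nExplanation:")
        return "\n".join(parts)

    demonstration = [{
        "history": "User: I loved Inception, any similar movies?",
        "item": "Interstellar",
        "explanation": "Since you enjoyed Inception, you may like Interstellar, "
                       "another Christopher Nolan film with a mind-bending plot.",
    }]
    print(build_explanation_prompt(demonstration, "User: I want a feel-good comedy.", "Paddington 2"))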

    Learning to Ask Conversational Questions by Optimizing Levenshtein Distance

    Conversational Question Simplification (CQS) aims to simplify self-contained questions into conversational ones by incorporating conversational characteristics such as anaphora and ellipsis. Existing maximum likelihood estimation (MLE) based methods often get trapped in easily learned tokens, as all tokens are treated equally during training. In this work, we introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions. RISE is thereby able to pay attention to tokens that are related to conversational characteristics. To train RISE, we devise an Iterative Reinforce Training (IRT) algorithm with a Dynamic Programming based Sampling (DPS) process to improve exploration. Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods and generalizes well to unseen data.
    Comment: 13 pages, 4 figures, Published in ACL 202
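
    For reference, the minimum Levenshtein distance that RISE optimizes is the standard token-level edit distance; the sketch below computes it with dynamic programming. The example sentences are invented, and RISE itself learns explicit editing actions with reinforcement learning rather than running this table directly.

    def levenshtein(src, tgt):
        # Standard dynamic-programming edit distance over token sequences:
        # d[i][j] is the minimum number of insertions, deletions, and substitutions
        # needed to turn the first i tokens of src into the first j tokens of tgt.
        d = [[0] * (len(tgt) + 1) for _ in range(len(src) + 1)]
        for i in range(len(src) + 1):
            d[i][0] = i
        for j in range(len(tgt) + 1):
            d[0][j] = j
        for i in range(1, len(src) + 1):
            for j in range(1, len(tgt) + 1):
                cost = 0 if src[i - 1] == tgt[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # delete src[i-1]
                              d[i][j - 1] + 1,         # insert tgt[j-1]
                              d[i - 1][j - 1] + cost)  # keep or substitute
        return d[len(src)][len(tgt)]

    # "do you like the movie" can be turned into "do you like it" with two edits
    # (substitute "the" -> "it", delete "movie"), so the distance is 2.
    print(levenshtein("do you like the movie".split(), "do you like it".split()))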